This work attains a threefold objective. First, we derived the richness-mass scaling in the local Universe from data of 53 clusters with
individual measurements of mass. We found a 0.46 ± 0.12 slope and a 0.25 ± 0.03 dex scatter measuring richness with a previously
developed method. Second, we showed on a real sample of 250 clusters at 0.06 < z < 0.9 with spectroscopic redshift, most of which are
at z < 0.3, that the colour of the red sequence allows us to measure the clusters' redshift to better than $\Delta z = 0.02$. Third,
we computed the predicted prior of the richness-mass scaling to forecast the capabilities of future wide-field-area surveys of galaxy
clusters to constrain cosmological parameters. To this aim, we generated a simulated universe obeying the richness-mass scaling that
we found. We observed it with a PanStarrs 1+Euclid-like survey, allowing for intrinsic scatter between mass and richness, for errors
on mass, on richness, and for photometric redshift errors. We fitted the observations with an evolving five-parameter richness-mass
scaling with parameters to be determined. Input parameters were recovered, but only if the cluster mass function and the weak-lensing
redshift-dependent selection function were accounted for in the fitting of the mass-richness scaling. This emphasizes the limitations
of often adopted simplifying assumptions, such as having a mass-complete redshift-independent sample. We derived the uncertainty
and the covariance matrix of the (evolving) richness-mass scaling, which are the input ingredients of cosmological forecasts using
cluster counts. We find that the richness-mass scaling parameters can be determined $10^5$ times better than estimated in previous works that
did not use weak-lensing mass estimates, although we emphasize that this high factor was derived with scaling relations with different
parameterizations. The better knowledge of the scaling parameters likely has a strong impact on the relative importance of the different
probes used to constrain cosmological parameters. The fitting code used for computing the predicted prior, including the treatment of
the mass function and of the weak-lensing selection function, is provided in the appendix. It can be re-used, for example, to derive the
predicted prior of other observable-mass scalings, such as the $L_X$-mass relation.
1. Introduction

If one has a sample of N clusters with measured properties,
$\mathrm{obsm}_i, z_i$ (where $i = 1, 2, \ldots, N$), for example in a Euclid-like
survey, their constraints on the cosmological parameters
$\theta = (\Omega_M, \Omega_\Lambda, \sigma_8, w, \ldots)$ can be derived by applying Bayes's theorem
to obtain the posterior distribution of the cosmological parameters:

$$p(\theta \mid \mathrm{obsm}_i, z_i) \propto p(\mathrm{obsm}_i, z_i \mid \theta)\, p(\theta) , \tag{1}$$

where $p(\theta)$ is the prior on cosmological parameters (e.g. from
other surveys) and $p(\mathrm{obsm}_i, z_i \mid \theta)$ is the likelihood of measuring N
clusters with measured properties $\mathrm{obsm}_i, z_i$. If the mass M were
observable, the cosmological parameters $\theta$ would be constrained
by fitting $p(M, z \mid \theta)$ to the observed distribution. However, this
direct fit is not possible with survey data, because one needs to
rely on an observable (mass proxy), such as richness or $Y_X$, and
fit the distribution of the observable, $O$, with $p(O, z \mid M)$. To estimate
the cosmological parameters, one needs to assume a model
for the scaling between the mass and the observable (usually a
power law) and some knowledge about how precisely the parameters
describing this relation are known (the mass-observable
prior): the knowledge may range from very precise (a delta function
prior) to very uncertain (e.g. an improper uniform prior). Of
course, cosmological estimates benefit from better known scaling
parameters, i.e. priors that enclose a narrow volume of the
parameter space that describes the mass-observable scaling.

Most previous forecasts dealing with counts of galaxy clusters
(e.g. Lima & Hu 2005, Sartoris et al. 2010, Carbone et al.
2012) assumed the precision with which the parameters of the
mass-observable scaling will be known instead of measuring it.
One of the purposes of this work is to quantify this part of the
inference: we aim to compute the uncertainties of the mass-observable
scaling, i.e. the volume of the mass-richness scaling
parameter space enclosed by the posterior probability distribution.
We consider, specifically, cluster richness as the mass
proxy. This analysis gives us the input prior of cosmological
forecasts using cluster counts.

The paper is organized as follows: in Sect. 2 we measure the
mass-observable relation in the local Universe from real data, we
determine how well the cluster redshift can be inferred from the
colour of the red sequence, and we compute in which part of the
universe the observable can be measured with current data. In
Sect. 3 we assume a fiducial model where the relation between
mass and proxy does not evolve. We populate a (simulated)
observable universe, we measure the parameter uncertainties by
fitting an evolving mass-observable relation to all data (real and
simulated), and we test our ability to recover an evolving mass-observable
relation. Finally, in Sect. 4 we discuss our results and
compare the measured uncertainties of the mass-observable scaling
with what has been thus far assumed in cosmological forecasts.
Sect. 5 summarizes the results of this work.

Throughout this paper we assume $\Omega_M = 0.3$, $\Omega_\Lambda = 0.7$,
$H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\sigma_8 = 0.8$. Magnitudes are quoted in
Fig. 1. Computed selection function (histogram) and its adopted
Gaussian approximation (curve).




Fig. 2. Distribution of the expected obslgM200 fake data (histogram)
and distribution of real data (points). Error bars mark
the standard deviation of the counts (i.e. are $\sqrt{n}$), not the error.

their native system (quasi-AB for SDSS magnitudes). All logarithms
in this work are in base ten, unless otherwise indicated.
All quantities are measured at the $r_{200c}$ radius, within which the
enclosed average mass density is 200 times the critical density. The
richness-mass calibration in this paper refers to richnesses measured
following the Andreon & Hurn (2010) prescriptions, and
therefore cannot be used for other types of richnesses, e.g. Abell
(1958) richnesses. We adopt the standard statistical notation: the
$\sim$ symbol reads "is drawn from" or "is distributed as" and the $\leftarrow$
symbol reads "takes the value of".

2. Calibration of the mass-proxy from current data 

2.1. Local calibration of the richness-mass relation based on 
real data 

In this section, we are interested in the scaling between richness 
and mass in the local Universe taking into account the noise in 
their measurement and selection effects. 

We re-analysed the very same data that were used in Andreon 
& Hurn (2010), adopting the modeling appropriate for the task 
of current interest. In short, the data consist of cluster richnesses, 
n200, based on red galaxies measured on specified luminosity 
and colour ranges within a fiducial radius, and masses derived 
from the caustic technique computed using 208 galaxies on average
per cluster for 53 galaxy clusters at 0.03 < z < 0.1. As detailed
in Andreon & Hurn (2010), the parameters describing the
mass-richness relation do not change if we use instead velocity-dispersion-based
masses. We emphasize that we used the values
denoted with a hat in Andreon & Hurn (2010) because they are
derived without knowledge of the mass-related quantities ($r_{200}$),
precisely as in real survey data. For notational simplicity, we here
suppress the hat notation adopted there.

Because it is an X-ray selected sample, the considered clus- 
ter sample is controlled, not random; therefore, bright clusters 
are over-represented. In general, a non-random selection causes 
biases in the recovered regression parameters if the selection is 
neglected (Gelman et al. 2003; Stanek et al. 2006; Pacaud et al. 
2007; Andreon, Trinchieri & Pizzolato 2011; Andreon & Moretti
2011; Andreon & Hurn 2012; and see also Sect. 3.2 where we
discuss this problem at length for a sample for which the non- 
random selection cannot be ignored). To be precise, the studied 
cluster sample is a random sampling (as detailed in Andreon & 
Hurn 2010) of an X-ray selected sample. Its controlled nature al- 
lows us to compute the mass selection function, which is essen- 
tial, in general, to correct for non-random mass selection leading 
to biases in the recovered regression parameters. We computed 
the mass selection function (mass prior) as follows: we assumed 
that the local cluster mass function is described by a Jenkins et 
al. (2001) mass function at the masses of interest (log M > 13.5
$M_\odot$). Our results are independent of the chosen parametrization
(e.g. they do not change if the Press & Schechter (1974) mass function is adopted). We then
followed Stanek et al. (2006): the mean relation between the X-ray
luminosity and the mass has a slope equal to 1.59, intercept
equal to $\ln L_{X,15} = 1.34$ (in a system employing different Hubble
constant conventions for luminosity and mass), intrinsic scatter
of 0.59, and the distribution of the (natural logarithm of the) X-ray
luminosity at a given mass is Gaussian, i.e.

$$\ln L_{X,i} \sim \mathcal{N}\left(1.59\,(\mathrm{lgM200}_i - 15) + 1.34,\ 0.59^2\right) . \tag{2}$$

This allows us to populate a simulated local universe,
0.03 < z < 0.1, with clusters of X-ray luminosity $\ln L_{X,i}$. The
flux of these (simulated) clusters is computed and the objects
are kept in the sample if $f_X > 3 \times 10^{-12}$ erg s$^{-1}$ cm$^{-2}$, which is the
flux threshold adopted by Rines & Diaferio (2006), the parent
sample from which Andreon & Hurn (2010) studied a random
subsample. Fig. 1 shows the result of this simulation, and the
adopted analytic (Gaussian) parametrization:

$$\mathrm{lgM200}_i \sim \mathcal{N}(14.5,\ 0.33^2) . \tag{3}$$



Assuming Eq. 3, we computed the expected distribution of
the observed values of lgM200, obslgM200, of our simulated survey,
assuming a common mass error of 0.14 dex, the
average value of the studied sample. We compared this to the
actual observed distribution (i.e. real data) in Fig. 2. The agreement
is impressive (there are no free parameters to tune), showing
that our modelling of the selection function captures the data
behaviour and gives us $p(\mathrm{lgM200})$, i.e. the probability that a cluster
has mass lgM200 and is included in the sample (i.e. the mass
prior). The derived $p(\mathrm{lgM200})$ allows us to avoid the biases coming
from the non-random mass distribution of our sample.
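The expected width of the observed mass distribution can be checked with a short sketch. It assumes only what the text states (the Gaussian mass prior of Eq. 3 and a common Gaussian error of 0.14 dex); the function names and the sample size are illustrative choices, not part of the original analysis.

```python
import numpy as np

# Mass prior of Eq. 3 and the common mass error quoted in the text (dex).
PRIOR_MEAN, PRIOR_SD = 14.5, 0.33
MASS_ERR = 0.14

def expected_obs_sd(prior_sd=PRIOR_SD, err=MASS_ERR):
    """Width of the observed lgM200 distribution: a Gaussian prior
    convolved with Gaussian errors is Gaussian, and variances add."""
    return float(np.hypot(prior_sd, err))

def simulate_obs_lgm200(n, seed=0):
    """Draw true lgM200 from the prior, then add Gaussian mass errors."""
    rng = np.random.default_rng(seed)
    true_lgm = rng.normal(PRIOR_MEAN, PRIOR_SD, size=n)
    return rng.normal(true_lgm, MASS_ERR)

obs = simulate_obs_lgm200(200_000)
print(expected_obs_sd(), obs.std())
```

Under these assumptions the observed distribution is only slightly broader than the prior (about 0.36 dex against 0.33 dex).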

We proceed by specifying the assumed mathematical dependence
between the quantities involved in our problem. We need
to acknowledge the uncertainty in all measurements and therefore,
because of errors, observed and true values are not identically
equal. The variables $\mathrm{n200}_i$ and $\mathrm{nbkg}_i$ represent the true


1 The tilde symbol indicates a similarity subject to stochasticity, ei- 
ther because of noise or because of intrinsic differences among mem- 
bers. Broadly speaking, the tilde symbol indicates that we account for 
uncertainty or non-homogeneity (variety). 
richness and the true background galaxy counts in the studied
solid angles. We measured the number of galaxies in both cluster
and control field regions, $\mathrm{obstot}_i$ and $\mathrm{obsbkg}_i$ respectively,
for each of our 53 clusters (i.e. for $i = 1, \ldots, 53$). We allowed
Poisson errors for both and we assumed that all measurements
are conditionally independent. The ratio between the cluster and
control field solid angles, $C_i$, is known exactly. In formulae:

$$\mathrm{obsbkg}_i \sim \mathcal{P}(\mathrm{nbkg}_i) \tag{4}$$
$$\mathrm{obstot}_i \sim \mathcal{P}(\mathrm{nbkg}_i/C_i + \mathrm{n200}_i) . \tag{5}$$
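As an illustration of the counting model of eqs. 4 and 5, the sketch below draws observed counts for a single hypothetical cluster. The numerical values (true richness 40, 200 background galaxies in the control field, solid-angle ratio 10) and the function names are assumptions made for the example; the background-subtracted estimate is a simple point estimate, not the posterior used in this paper.

```python
import numpy as np

def simulate_counts(n200, nbkg, c_ratio, rng, size=None):
    """Draw observed counts following eqs. 4 and 5.

    n200    : true richness in the cluster solid angle
    nbkg    : true background counts in the control-field solid angle
    c_ratio : solid-angle ratio C_i by which nbkg is divided in eq. 5
    """
    obsbkg = rng.poisson(nbkg, size=size)                   # eq. 4
    obstot = rng.poisson(nbkg / c_ratio + n200, size=size)  # eq. 5
    return obstot, obsbkg

def naive_richness(obstot, obsbkg, c_ratio):
    """Background-subtracted point estimate of the richness."""
    return obstot - obsbkg / c_ratio

rng = np.random.default_rng(0)
obstot, obsbkg = simulate_counts(40.0, 200.0, 10.0, rng, size=100_000)
estimate = naive_richness(obstot, obsbkg, 10.0)
print(estimate.mean(), estimate.std())
```

The point estimate is unbiased on average, but for individual clusters it can become negative when counts are low, which is one reason for modelling the true values explicitly.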

For each cluster, we have a cluster mass measurement and a
measurement of the error associated with this mass, $\mathrm{obslgM200}_i$
and $\mathrm{obserrlgM200}_i$ respectively. We allowed Gaussian errors on
mass:

$$\mathrm{obslgM200}_i \sim \mathcal{N}(\mathrm{lgM200}_i,\ \mathrm{obserrlgM200}_i^2) . \tag{6}$$

We assume a power law relation between mass and n200
with intercept $\alpha + 1.5$, slope $\beta$ and intrinsic scatter $\sigma_{scat}$:

$$\mathrm{lgn200}_i \sim \mathcal{N}\left(\alpha + 1.5 + \beta\,(\mathrm{lgM200}_i - 14.5),\ \sigma_{scat}^2\right) . \tag{7}$$

The quantity lgM200 is centred at an average value of
14.5 and $\alpha$ is centred at 1.5, for computational advantages in
the MCMC algorithm used to fit the model (it speeds up convergence,
improves chain mixing, etc.) and to reduce the covariance
between parameters. The relation is between true values, not between
observed values, which may be biased.

The priors on the slope and the intercept of the regression
line in Equation 7 were taken to be quite flat: a zero-mean
Gaussian with very large variance for $\alpha$ and a Student's t distribution
with one degree of freedom for $\beta$. The latter choice was
made so that the inferred properties of galaxy clusters do not depend on
the astronomer's convention for measuring angles (from the x or from the y
axis). This agrees with the model choices in Andreon (2006 and
later works). Our t distribution on $\beta$ is mathematically equivalent
to a uniform prior on the angle b of the regression line ($\beta = \tan b$). In formulae:

$$\alpha \sim \mathcal{N}(0.0,\ 10^4) \tag{8}$$
$$\beta \sim t_1 . \tag{9}$$
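The equivalence between a Student's t prior with one degree of freedom on the slope and a uniform prior on the angle can be verified numerically. The sketch below is only an illustration: it draws angles uniformly, converts them to slopes, and compares the empirical distribution with the analytic $t_1$ (Cauchy) cumulative distribution; the grid and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Uniform prior on the angle b of the regression line, in (-pi/2, pi/2).
angle = rng.uniform(-np.pi / 2, np.pi / 2, size=400_000)
slope = np.tan(angle)  # beta = tan(b)

def t1_cdf(x):
    """CDF of a Student's t with one degree of freedom (a Cauchy)."""
    return 0.5 + np.arctan(x) / np.pi

grid = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
empirical = np.array([(slope <= g).mean() for g in grid])
print(np.column_stack([grid, empirical, t1_cdf(grid)]))
```

The two columns of probabilities agree to within sampling noise, so the slope prior does not favour lines close to either axis.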

For the true values of the background, we choose to impose
no strong a priori values, only enforcing positivity, by adopting
an improper uniform prior,

$$\mathrm{nbkg}_i \sim \mathcal{U}(0, \infty) . \tag{10}$$

Fitting our sample of 53 clusters with the model above, we
found:

$$\mathrm{lgn200} = (0.47 \pm 0.12)\,(\mathrm{lgM200} - 14.5) + 1.58 \pm 0.04 . \tag{11}$$

Unless otherwise stated, the results of the statistical computa- 
tions are quoted in the form x ± y where x is the posterior mean 
and y is the posterior standard deviation. All statistical computa- 
tions were performed using JAGS (Plummer 2010), see the ap- 
pendix for an example. 

Figure 3 shows the scaling between richness and mass, the
observed data, the mean scaling (solid line), its 68% uncertainty
(shaded yellow region), and the mean intrinsic scatter
(dashed lines) around the mean relation. The $\pm 1$ intrinsic scatter
band contains 60% of the data points and is not expected to
contain 68% of them, because of measurement errors.

Figure 4 shows the posterior marginals for the key parameters,
i.e. for the intercept, slope, and intrinsic scatter $\sigma_{scat}$. These
marginals are reasonably well approximated by Gaussians. The
Fig. 3. Richness-mass scaling. The solid line marks the mean
fitted regression line of lgn200 on lgM200, while the dashed
lines show this mean plus or minus the intrinsic scatter $\sigma_{scat}$. The
shaded region marks the 68% highest posterior credible
interval for the regression. Error bars on the data points represent
observed errors for both variables. The distances between the
data and the regression line are due in part to the measurement
error and in part to the intrinsic scatter.



intrinsic mass scatter at a given richness, $\sigma_{scat} = \sigma_{\mathrm{lgM200}|\log \mathrm{n200}}$,
is small, 0.25 ± 0.03 dex. These posterior probability distributions
are dominated by the data (their widths are much smaller
than the prior widths), i.e. our results are independent of the assumed
prior to all practical effects. Parameters show no appreciable
covariance (figure not shown) because of our choice of
zero-pointing masses near the data average (eq. 7). This allows
a simpler summary of the posterior, which we use in our next
inference step (eqs. 17 to 19).

We note that these results are almost indistinguishable from
results we might obtain without modelling the selection function,
basically because the prior is broad compared to lgM200
errors.

2.1.1. Side comments

Cosmological forecasts dealing with cluster counts in the opti- 
cal sometimes use the scatter between observable and mass from 
Rykoff et al. (2012) or Rozo et al. (2009). It is worth empha- 
sizing that to measure the scatter between two quantities, it is 
strongly preferable to have both. Neither of these two works has
individual values of cluster masses.

It is worth remembering that the slope of the direct relation
is not the inverse of the slope of the inverse relation, i.e. if
$O \propto M^{\gamma}$, then usually $M \not\propto O^{1/\gamma}$ (e.g. Isobe et al. 1990, Andreon
& Hurn 2010). Therefore, it is not surprising that the slope between
mass and richness is not the reciprocal of the slope determined
in Andreon & Hurn (2010) for the inverse relation using
the very same data. Furthermore, the slope depicted in Figure 3
is not "too shallow" compared to the data; a steeper slope would
systematically over- or under-estimate the cluster richness (see
Andreon & Hurn 2010, 2012 for a brief astronomical introduction
on regression fitting).
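A toy simulation makes the point about direct and inverse slopes concrete. The sketch below is not the Bayesian fit used in this paper: it uses ordinary least squares on error-free simulated values, with the slope, intercept, intrinsic scatter and mass distribution taken from the numbers quoted in Sect. 2.1, purely as illustrative inputs.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Toy population: masses from the Gaussian of Eq. 3, richness from Eq. 11
# with 0.25 dex of intrinsic scatter (no measurement errors here).
lgm = rng.normal(14.5, 0.33, size=n)
lgn = 1.58 + 0.47 * (lgm - 14.5) + rng.normal(0.0, 0.25, size=n)

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

direct = ols_slope(lgm, lgn)    # richness at a given mass
inverse = ols_slope(lgn, lgm)   # mass at a given richness
print(direct, inverse, 1.0 / direct)
```

With these inputs the direct slope is close to 0.47, while the inverse slope is about 0.6 rather than the reciprocal value of about 2.1: the two regressions answer different questions whenever there is scatter.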
Fig. 4. Posterior probability distribution for the parameters of the richness-mass scaling computed from the real data. The black
jagged histogram shows the posterior as computed by MCMC, marginalised over the other parameters. The red curve is a Gaussian 
approximation of it. The shaded (yellow) range shows the 95% highest posterior credible interval. 



2.2. In which part of the Universe is richness measurable 
with current data? 

The cluster richness was derived using $g - r$ colour and luminosities
of galaxies brighter than an evolving limiting magnitude
$M_V < -20$.

Figure 5 illustrates how depth and colour constraints change
with redshift. The top panel illustrates the apparent luminosity
of a red $M^e_V = -20$ mag galaxy, modelled as a $z_f = 5$ single stellar
population using the 2007 version of the Bruzual & Charlot
(2003) synthesis population model for different filters: g, r, i, and
z for the $3\pi$ steradian PanStarrs 1 survey (PS1, hereafter) and
riz, Y, J, and H for Euclid (Laureijs et al. 2011) with the corresponding
$\sim 10\sigma$ depth (horizontal ticks). For the $3\pi$ PanStarrs
1, we took the current depth, i.e. that already achieved after the
first two years of observation (Kaiser, N., private communication).
The PS1 has a Y-like filter, not plotted because it is shallower
than the Euclid Y. The Dark Energy Survey (Abbott et al.
2005) is deeper than PS1, but covers a smaller solid angle. The
Euclid consortium plans to have ground-based griz data deeper
than our need over the whole 15000 deg$^2$ survey area (Laureijs
et al. 2011).

The bottom panel illustrates the wavelength range sampled
by these filters. Only redshift bins where galaxies are brighter
than the $10\sigma$ depth are plotted. The shaded yellow region is the $\lambda$ range
sampled by $g - r$ at z < 0.08. As the figure shows, we always
have at least two filters in the shaded region, i.e. up to z = 1 at
least these data have appropriate depth and wavelength coverage
to count galaxies. Indeed, the $M_V < -20$ mag cut was chosen
precisely to perform this measurement on ten-year-old MOSAIC-II
CTIO images up to z = 0.82 (e.g. those in Andreon et al. 2004a).
These depths are routinely achieved in current surveys, such as
the CFHTLS (Cuillandre & Bertin 2006).

To summarize, upcoming (and also current) surveys have the
depth and filter coverage adequate to compute the number of red
galaxies needed to derive n200. Furthermore, Andreon (2012)
showed that the galaxy background ($\mathrm{nbkg}_i/C_i$ in eq. 5) is negligible
even at magnitudes fainter than those adopted in this work,
and not detrimental at all for the derivation of the cluster richness.



2.3. Which precision for photometric redshift? 

Surveys such as those performed by PanStarrs 1, DES, or Euclid 
will detect thousands of clusters and it is unreasonable to expect 
Fig. 5. Depth and wavelength coverage of the two-year PS1 and
Euclid surveys. Upper panel: g, r, i, and z (from left to right) filters
are indicated with dashed (blue, green, red, and black) lines.
riz, Y, J, and H (from left to right) Euclid filters are indicated
with thick solid (blue, green, red, and black) lines. The horizontal
ticks indicate the $\sim 10\sigma$ depth; most of them are at z > 1 and
thus not visible in the plot. Bottom panel: Wavelength coverage
of the filters for redshift bins where galaxies are brighter than the
$\sim 10\sigma$ depth. The shaded (yellow) region marks the wavelength
sampling of $g - r$ at $z \sim 0$.



that all of them will have a spectroscopic redshift. How precise 
will their redshift estimate be? We can set a conservative esti- 
mate by considering current shallower surveys that sample sim- 
ilar redshifts. 

We considered spectroscopic and photometric redshifts of 
the sample of 228 clusters at 0.06 < z < 0.3 in Andreon 
(2003a,b) and the 16 0.3 < z < 0.9 clusters in Andreon et 
al. (2004a, 2004b). They are all colour-detected with the red- 
sequence method of Andreon (2003a), which is an adaptation 
of the Gladders & Yee (2000) original method (see Andreon 
Fig. 6. Red sequence photometric redshift performance.
Spectroscopic redshift vs photometric redshift from the colour
of the red sequence in $g - r$ (small open points; error bars are
not plotted to avoid crowding; there are 228 plotted points,
sometimes one on top of the other) or $R - z'$ (solid points with
error bars). The $z_{phot} = z_{spec}$ line and the $z_{phot} = z_{spec} \pm 0.02$ loci
are indicated with solid lines.

2003a for details) in either the SDSS early-data release area or 
in the XMM-LSS field. These clusters tend to be of low rich- 
ness and therefore to have a less prominent red sequence than 
that of the massive clusters that we consider below. For both 
samples, the colour of the red sequence was determined us- 
ing two-band photometry only, g - r (at z < 0.3) or R - z' 
(at z > 0.3). The photometric redshift was derived from the 
colour of the red sequence adopting a relation between redshift 
and colour (an empirical template at z < 0.3, an old galaxy 
template at higher redshift, as detailed in Andreon 2003a and 
Andreon et al. 2004a, respectively). Fig. 6 shows $z_{phot}$ vs $z_{spec}$
for the 244 clusters, the (straight) $z_{phot} = z_{spec}$ line and the
$z_{phot} = z_{spec} \pm 0.02$ loci. Twenty-five percent of the points have
$|z_{phot} - z_{spec}| > 0.02$, while > 32% are expected if the photometric
error is 0.02. Even restricting the attention to z > 0.3, 6
clusters show $|z_{phot} - z_{spec}| > \sqrt{0.02^2 + err^2}$ vs 5.1 expected
cases if the redshift derived from the red sequence has an intrinsic
scatter of 0.02. This implies that we can already achieve a
$\Delta z = 0.02$ precision using the colour of the red sequence measured in
two bands. Similar results were found by Puddu et al. (2001) for
a small, but X-ray selected (and therefore more massive) clus- 
ter sample, and by High et al. (2010) for a small, but mostly at 
z > 0.3, cluster sample. In both cases the estimate of the clusters' 
redshift is based on the colour of the red sequence. 
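The expected numbers of outliers quoted above follow from the Gaussian tail probability. The short sketch below reproduces them under the assumption of purely Gaussian redshift errors; the function name is illustrative.

```python
import math

def frac_outside(threshold, sigma):
    """Fraction of Gaussian deviates with |x| > threshold."""
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))

# One-sigma threshold: photometric error 0.02, cut at |dz| > 0.02.
expected_fraction = frac_outside(0.02, 0.02)

# Expected number of outliers among the 16 clusters at z > 0.3.
expected_highz = 16 * expected_fraction
print(expected_fraction, expected_highz)
```

About 32% of the points are expected beyond one standard deviation, i.e. roughly 5 of the 16 clusters at z > 0.3, which is the comparison made in the text.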

The extremely good performance of the red sequence colour 
as a redshift indicator is hardly surprising because of the implicit 
selection of one single type of galaxies with a distinctive 4000 Å
break (spectrophotometric bright early-type galaxies) and of the 
colour homogeneity of the early-type galaxy class (e.g. Stanford, 
Eisenhardt, & Dickinson 1998, Kodama et al. 1998, Andreon 
2003a,b, Andreon et al. 2004a). 

In summary, we can safely assume for future clusters a (con- 
servative) 0.02 error on cluster (photometric) redshifts, because 
this performance is already achieved today using the colour of 
the red sequence. 



3. Calibration with future surveys 

3.1. Generation of mock-calibration Euclid data 

We generated a Monte-Carlo simulated universe obeying the
mass-richness scaling we just computed and observed it with a 
PanStarrs 1+Euclid-like survey. Our fiducial universe has un- 
evolving parameters that describe the mass-richness scaling. A 
Euclid-like survey is needed to measure cluster masses, whereas 
for the computation of cluster richness one needs shallower, but 
multicolour data, such as already acquired by the PanStarrs 1 
survey. 

We followed Bergé et al. (2010) to compute the number (the
probability times the volume) of clusters in the Euclid-wide survey
at redshift z, with mass lgM200, which produce a weak-lensing
signal with a given signal-to-noise ratio S/N. We used
the halo model with an NFW (Navarro, Frenk & White 1997)
profile, a Jenkins et al. (2001) mass function, and a modified Sheth,
Mo & Tormen (2001) bias (see the Bergé et al. 2010 appendix
for a detailed description). We assumed a galaxy shape noise
$\sigma_{int} = 0.3$, and a galaxy number density $n_g = 30$ arcmin$^{-2}$. We
also assumed that all halos are spherical and therefore did not
account for the shape bias described by Hamana et al. (2012).
Projection effects are, in these observational conditions and for
clusters as massive as those of interest in our paper, largely subdominant
(Marian & Bernstein 2006), and were neglected for
this reason. For the Euclid survey, we adopted the updated sky
coverage (15000 deg$^2$). The iso-density contours in Fig. 7 indicate
lines where we expect 1, 10, and 100 clusters with an
S/N > 5 per bin of 0.1 dex in mass and 0.0275 in redshift in the
Euclid survey (15000 square degrees). The minimal S/N = 5
mass, lgM200trunc, is well described by

$$\mathrm{lgM200trunc} = 13.9891 + 1.04936\,z + 0.48888\,z^2 . \tag{12}$$

We exploited these masses to calibrate the richness-mass relation 
and its evolution. 
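For convenience, the S/N = 5 mass threshold of eq. 12 can be written as a one-line function. The coefficients below are transcribed from eq. 12 as printed here; the function name is an illustrative choice.

```python
def lgm200_trunc(z):
    """Minimal lgM200 detectable at S/N = 5 at redshift z (eq. 12).

    Works on floats and on NumPy arrays alike.
    """
    return 13.9891 + 1.04936 * z + 0.48888 * z ** 2

for z in (0.03, 0.3, 0.62):
    print(z, round(lgm200_trunc(z), 3))
```

The threshold rises from about 14.0 at the lowest redshifts to about 14.8 at z = 0.62, which is why only the most massive clusters contribute at the high-redshift end of the mock sample.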

First of all, we generated a Monte-Carlo realization of the
Bergé et al. (2010) distributions. Then, we selected S/N > 5
detections only, because we did not want to deal with too
noisy measurements (Hamana et al. 2012; Pace et al. 2007).
Furthermore, we removed clusters at z < 0.03 to avoid very
nearby clusters with bright and large galaxies, whose photometry
will likely be corrupted$^3$. This left us about 11000 clusters
with available $z_i$, $\mathrm{lgM200}_i$, and $(S/N)_i$.

Cluster masses were then observed, i.e. mass errors were
taken to be Gaussian and equal to $\mathrm{errlgM200}_i = (S/N)_i^{-1} / \ln(10)$,
where the latter term is due to our choice of measuring errors
using decimal logarithms:

$$\mathrm{obslgM200}_i \sim \mathcal{N}(\mathrm{lgM200}_i,\ \mathrm{errlgM200}_i^2) . \tag{13}$$
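The conversion between weak-lensing signal-to-noise ratio and the error on the decimal logarithm of the mass can be sketched as follows. This assumes, as our reading of the text above, that the fractional mass error is the inverse of S/N, so that the error in dex is $1/(S/N)/\ln(10)$; the function name is illustrative.

```python
import math

def err_lgm200(snr):
    """Error on lgM200 (dex) for a weak-lensing detection of given S/N.

    Assumes a fractional mass error of 1/(S/N); the ln(10) factor
    converts a fractional error into an error on the decimal logarithm.
    """
    return 1.0 / (snr * math.log(10.0))

for snr in (5, 10, 20):
    print(snr, round(err_lgm200(snr), 4))
```

Under this assumption the largest mass error in the mock sample, reached at the S/N = 5 threshold, is slightly below 0.09 dex.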

Cluster richnesses were assigned to simulated clusters assuming
the model measured in the local Universe, i.e. (Sect. 2.1)

$$\mathrm{lgn200}_i \sim \mathcal{N}\left(0.47\,(\mathrm{lgM200}_i - 14.5) + 1.58,\ 0.25^2\right) . \tag{14}$$

We emphasize, once more, that we allowed for an intrinsic
scatter, i.e. we allowed clusters of a given mass to have a variety
of richnesses. Richnesses were then observed: richness, like all
measurements in this paper, has errors, which were assumed to
be Poissonian,

$$\mathrm{obsn200}_i \sim \mathcal{P}(\mathrm{n200}_i) . \tag{15}$$

3 For example in the SDSS, which is much shallower and therefore 
less tailored for faint galaxies, photometry of galaxies at z < 0.02 suffers 
from shredding problems. 
Fig. 7. Contours: number of clusters for which weak-lensing
mass estimates can be obtained by a Euclid-like survey. From
outer to inner contours, the lines represent isocontours of S/N >
5 weak-lensing detection of 1, 10, and 100 clusters as a function
of redshift and mass. Points: a Poisson realization of the above,
with errors on mass and redshift (these move points outside the
N = 1 contour).

Finally, we also allowed Gaussian photometric redshift errors, taken
to be 0.02 at all redshifts (see Sect. 2.3):

$$\mathrm{obsz}_i \sim \mathcal{N}(z_i,\ 0.02^2) . \tag{16}$$
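The observation steps of eqs. 13 to 16 can be collected in a short sketch. It is a minimal illustration, not the code used for this paper: the input arrays of true masses, redshifts and S/N are supplied by the caller (here a toy input with identical clusters), the mass error assumes a fractional error of 1/(S/N), and the Poisson noise is applied to the counts $10^{\mathrm{lgn200}}$.

```python
import numpy as np

def observe_mock(lgm200, z, snr, seed=0):
    """Apply eqs. 13-16 to true masses, redshifts and weak-lensing S/N."""
    rng = np.random.default_rng(seed)
    lgm200 = np.asarray(lgm200, dtype=float)
    z = np.asarray(z, dtype=float)
    snr = np.asarray(snr, dtype=float)

    err_lgm200 = 1.0 / (snr * np.log(10.0))       # assumed form of the mass error
    obs_lgm200 = rng.normal(lgm200, err_lgm200)   # eq. 13
    lgn200 = rng.normal(0.47 * (lgm200 - 14.5) + 1.58, 0.25)  # eq. 14
    obs_n200 = rng.poisson(10.0 ** lgn200)        # eq. 15, Poisson noise on counts
    obs_z = rng.normal(z, 0.02)                   # eq. 16
    return {"obs_lgm200": obs_lgm200, "err_lgm200": err_lgm200,
            "obs_n200": obs_n200, "obs_z": obs_z}

# Toy input: identical clusters with lgM200 = 14.5, z = 0.2 and S/N = 6.
n = 100_000
mock = observe_mock(np.full(n, 14.5), np.full(n, 0.2), np.full(n, 6.0))
print(mock["obs_n200"].mean(), np.median(mock["obs_n200"]))
```

Because the intrinsic scatter is Gaussian in the logarithm, the mean observed richness of these toy clusters is larger than the median, a feature also visible in the mean and median richness of the simulated sample.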

This procedure yielded 10714 clusters with measured
$\mathrm{obsn200}_i$, $\mathrm{obslgM200}_i$, $\mathrm{errlgM200}_i$, and $\mathrm{obsz}_i$, which we used to
determine the relation between richness and mass and its evolution.
Fig. 7 depicts them individually (points). Because of mass
errors, there are points below the minimal S/N = 5 mass (the diagonal
slightly bent line). The average richness (obsn200) of the
simulated sample is 46 galaxies, while the median is 38 galaxies.
Two thirds of them are at z < 0.3, where the scatter between
redshift and photometric redshift is better sampled by our real
data (Sect. 2.3). The generated sample does not contain any cluster
with a weak-lensing detection at S/N > 5 at z > 0.62 (Fig.
7).

3.2. Determining the richness-mass predicted priors 

We now combine the real data from the local Universe with the
simulated data (depicted in Fig. 7), to compute how well we are 
able to measure the richness-mass scaling at all redshifts. In this 
section we will not use true values because these are unknown 
for the real data. Furthermore, we cannot assume to know how 
the parameters of the richness-mass scaling evolve, because this 
is precisely what we want to infer from the data. 

The information encoded in the local Universe (Sect. 2.1) is
the current prior:

$$\sigma_{scat} \sim \mathcal{N}(0.25,\ 0.03^2) \tag{17}$$
$$\alpha \sim \mathcal{N}(0.08,\ 0.04^2) \tag{18}$$
$$\beta \sim \mathcal{N}(0.47,\ 0.12^2) . \tag{19}$$

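We assumed that the scatter and the intercept may both
change with redshift:

$$\mathrm{lgn200m}_i \leftarrow \alpha + 1.5 + \beta\,(\mathrm{lgM200}_i - 14.5) + \gamma \ln(1 + z_i) \tag{20}$$
$$\mathrm{lgn200}_i \sim \mathcal{N}\left(\mathrm{lgn200m}_i,\ \sigma^2_{intrscat}(z_i)\right) \tag{21}$$
$$\sigma_{intrscat}(z_i) \leftarrow \sigma_{intrscat} - 1 + (1 + z_i)^{\zeta} . \tag{22}$$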
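The evolving scaling of eqs. 20 to 22 can be sketched as two small functions. The form of the scatter evolution follows our reading of eq. 22, and the function and argument names are illustrative; with $\gamma = \zeta = 0$ the model reduces to the unevolving local relation.

```python
import numpy as np

def mean_lgn200(lgm200, z, alpha, beta, gamma):
    """Mean log richness at a given mass and redshift (eq. 20)."""
    return alpha + 1.5 + beta * (lgm200 - 14.5) + gamma * np.log(1.0 + z)

def intr_scatter(z, sigma0, zeta):
    """Redshift-dependent intrinsic scatter (eq. 22); equals sigma0 at z = 0."""
    return sigma0 - 1.0 + (1.0 + z) ** zeta

# Fiducial (unevolving) universe: gamma = zeta = 0.
print(mean_lgn200(14.5, 0.5, alpha=0.08, beta=0.47, gamma=0.0))
print(intr_scatter(0.5, sigma0=0.25, zeta=0.0))
```

Both evolution terms vanish at z = 0 for any value of $\gamma$ and $\zeta$, so the local calibration constrains $\alpha$, $\beta$ and the scatter normalization, while the higher-redshift data constrain the evolution.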



While the adopted modelling of the evolution is common in
previous works (e.g. Sartoris et al. 2010, Carbone et al. 2012),
we emphasize that a different modelling is possible and legitimate.
We also emphasize that, as in previous works, we assumed
that the analytic expression of the distribution
function of the intrinsic scatter term (a Gaussian) is perfectly known, when its shape
should be left more flexible or, at the very least, checked with data,
because this uncertainty may be dominant (Shaw et al. 2010).
Equation 21 and the fitting code (given in the appendix) may
be easily modified by replacing the adopted Gaussian with a more
flexible distribution, e.g. a mixture of two Gaussians, which
guarantees a valid (positive) probability distribution, unlike the
Edgeworth series expansion proposed in Shaw et al. (2010).

We adopted weak priors for the newly introduced parameters:
as prior for the $\gamma$ and $\zeta$ slopes we adopted a Student's t distribution
centred on zero with one degree of freedom, as for the
slope $\beta$ in Sect. 2.1, to make our choice independent of the astronomer's
convention for measuring angles. In formulae:

$$\gamma \sim t_1 \tag{23}$$
$$\zeta \sim t_1 . \tag{24}$$

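As in previous sections, richness has Poisson errors:

$$\mathrm{obsn200}_i \sim \mathcal{P}(\mathrm{n200}_i) , \tag{25}$$

whereas masses and photometric redshifts have Gaussian errors:

$$\mathrm{obslgM200}_i \sim \mathcal{N}(\mathrm{lgM200}_i,\ \mathrm{errlgM200}_i^2) \tag{26}$$
$$\mathrm{obsz}_i \sim \mathcal{N}(z_i,\ 0.02^2) . \tag{27}$$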



To complete the model description, we need to specify the 
mass prior. We cannot ignore that the mass function is steep 
and that the weak-lensing S /N > 5 cut introduces an abrupt 
discontinuity: ignoring them would lead to a biased fit (the re- 
covered slope would be much shallower than the input one) due 
to a Malmquist-like bias. Indeed, mass errors tend to make the 
distribution in mass broader, especially at low-mass values, be- 
cause of the sharp S /N = 5 weak-lensing detection requirement, 
but also at high-mass values because of the steepness of the 
mass function. Since high-mass values are overestimated and 
low-mass values are underestimated, any quantity that is fitted 
against these (biased) values neglecting the selection function 
would return a shallower relation (see also Andreon & Hurn 
2010 for the similarly biased mass-richness relation of Johnston 
et al. 2007). For mathematical simplicity and given the small 
mass range explored, we modelled the Jenkins et al. (2001) mass 
distribution at a given redshift as a Schechter (1976) function 
with slope $-1$ and characteristic mass given by

$$\mathrm{lgM200}^* = 12.6 - (z - 0.3) \tag{28}$$

truncated at lgM200trunc, given by eq. 12. The parameters of
eq. 28 were determined by fitting the Jenkins et al. (2001) mass
function.
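The truncated Schechter mass prior can be sketched as an unnormalised log density. The sketch assumes that the slope $-1$ refers to the number density per unit mass, so that the density per unit lgM200 carries one extra power of mass; this convention, the function names, and the hard cut at the eq. 12 threshold are assumptions of the example rather than a transcription of the fitting code in the appendix.

```python
import numpy as np

def lgm200_star(z):
    """Characteristic mass of eq. 28."""
    return 12.6 - (z - 0.3)

def lgm200_trunc(z):
    """S/N = 5 mass threshold of eq. 12."""
    return 13.9891 + 1.04936 * z + 0.48888 * z ** 2

def log_mass_prior(lgm200, z, slope=-1.0):
    """Unnormalised log prior density per unit lgM200 at redshift z.

    Schechter form dn/dM ~ x**slope * exp(-x) with x = M/M*; per unit
    lgM200 this becomes x**(slope + 1) * exp(-x). Masses below the
    weak-lensing threshold get zero probability.
    """
    lgm200 = np.asarray(lgm200, dtype=float)
    x = 10.0 ** (lgm200 - lgm200_star(z))
    logp = (slope + 1.0) * np.log(x) - x
    return np.where(lgm200 >= lgm200_trunc(z), logp, -np.inf)

print(log_mass_prior([14.0, 14.5, 15.0], z=0.3))
```

In the mass range of interest the prior falls steeply with mass and is cut sharply at the detection threshold, the two features responsible for the bias described above.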

On the other hand, we do not need to model the optical cluster
selection function: given the large cluster richness and the
photometric depth, all clusters that produce a detectable
weak-lensing signal are easily detected as overdensities of red
galaxies, because they have, on average, 38 galaxies projected
on a background of (nearly) zero galaxies.

We do not need, either, to accurately model the redshift
prior, because photometric redshifts are well determined. We
can therefore assume a uniform distribution for it,

$$z_i \sim \mathcal{U}(0, 1) \tag{29}$$

although we emphasize that for large photometric redshift errors
one should account for gradients in n(z).

Fig. 8. Marginal (panels on the diagonal) and joint (other panels) probability distributions of the mass-richness scaling derived from
real and simulated data for a PS1+Euclid-like survey. Red jagged contours and histograms refer to probabilities computed from
the MCMC sampling, while the blue smooth contours/lines refer to a Gaussian approximation of the numerically found posterior.
Contours are at the 68% probability level. Vertical (cyan) lines and crosses indicate the values used to generate the data, while the
dashed (green) lines show the current low-redshift calibration of the richness-mass scaling.

We emphasize that modelling the mass function and the selection
function is compulsory; not accounting for them leads to a fitted
slope more than $5\sigma$ away from the input one. Therefore, results
based on methods that cannot account for the mass and selection
functions, e.g. the usual linear regression analysis based on BCES
(Akritas & Bershady 1996), or simplistic forecast analyses lacking
any treatment of the selection function (as is typical of Fisher
analyses), should be used with great caution. On the other hand,
one should not be overly anxious about modelling the mass and
selection functions: what matters is their general shape, which
drives the correction of the bias, not their precise shape, i.e.
whether the mass function is a Tinker et al. (2008) or a
Jenkins et al. (2001) mass function, for instance. The uncertainty
on the precise shape of the mass function, neglected in this work
because of the small mass range considered, is of secondary
importance compared to the large uncertainty introduced by the
mass errors. The main points to keep in mind are that the mass
function is certainly not uniform, that it evolves with redshift,
and that the clusters entering the sample are not a random
sampling of the mass function (all those with low mass are
excluded, and the limiting mass changes with redshift). We account
for all of this (not doing so leads to parameters off by more than
$5\sigma$, as mentioned), while other observable-mass fitting models
(sometimes implicitly) assume a uniform prior on cluster mass and
mass-random selection, unless otherwise specified.
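The direction of the bias can be reproduced with a toy simulation. The Python sketch below draws true masses from a steep mass function, adds Gaussian mass errors, applies a sharp cut on observed mass, and fits a straight line while ignoring the mass function and the selection. All numerical choices (grid limits, the 0.2 dex mass error, the 14.2 cut, the function name) are assumptions of this example and are not the survey configuration of the paper; the mass-function shape mimics the one in the appendix code at z = 0.3.

```python
import numpy as np

def naive_fit_slope(n_draws=200_000, beta=0.47, err_lgm=0.2, lgm_cut=14.2, seed=1):
    """Least-squares slope of log richness vs observed log mass, ignoring selection."""
    rng = np.random.default_rng(seed)
    # Steep mass function sampled on a grid of log masses
    grid = np.arange(13.5, 16.0, 0.001)
    weights = np.exp(-(10.0 ** (0.4 * (grid - 12.6))))
    lgm_true = rng.choice(grid, size=n_draws, p=weights / weights.sum())
    # True scaling with 0.25 dex intrinsic scatter
    lgn = 1.58 + beta * (lgm_true - 14.5) + rng.normal(0.0, 0.25, n_draws)
    # Noisy masses and a sharp detection threshold on the observed value
    lgm_obs = lgm_true + rng.normal(0.0, err_lgm, n_draws)
    keep = lgm_obs > lgm_cut
    slope, _ = np.polyfit(lgm_obs[keep], lgn[keep], 1)
    return slope, int(keep.sum())

slope, n_kept = naive_fit_slope()
```

In this toy setup the naive slope comes out shallower than the input value of 0.47, because observed masses at the high-mass end are on average overestimated. The size of the effect depends on the assumed numbers and should not be read as the paper's result.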

The software implementation of this fitting model is given in 
the appendix. 

Fitting the simulated+real data with this model returns pa- 
rameters whose (posterior) probability distributions are depicted 
in Fig. 8. Fig. 8 and its summary in Table 1 are among the main
results of this work, since they are the priors (starting points)
needed to forecast cosmological parameters with cluster data.

Marginal probabilities are shown on the diagonal, while the
other panels show the joint probability distributions, i.e. the
covariance between pairs of parameters. Each panel reports two
closely packed lines: the smooth one is the Laplace (Gaussian)
approximation of the posterior, while the histogram/jagged contour
is the direct outcome of the numerical computation (somewhat
noisy because of the finite length of the MCMC chain).
The Laplace (Gaussian) approximation captures the probability
distributions well.
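One simple way to build and check a Gaussian approximation of a sampled posterior is sketched below in Python. It matches the mean and covariance of the chain, which is an assumption of this example: the exact procedure used to obtain the approximation shown in Fig. 8 is not described here, and the function names are hypothetical.

```python
import numpy as np

def gaussian_summary(chain):
    """Mean vector and covariance matrix of an (n_samples, n_params) chain."""
    chain = np.asarray(chain, dtype=float)
    return chain.mean(axis=0), np.cov(chain, rowvar=False)

def fraction_inside_68_2d(chain_2d):
    """Fraction of 2D samples inside the 68% ellipse of the Gaussian approximation."""
    chain_2d = np.asarray(chain_2d, dtype=float)
    mean, cov = gaussian_summary(chain_2d)
    d = chain_2d - mean
    # Squared Mahalanobis distance of every sample
    r2 = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    # For a 2D Gaussian, P(r2 < c) = 1 - exp(-c / 2)
    return float(np.mean(r2 < -2.0 * np.log(1.0 - 0.68)))
```

If the returned fraction is close to 0.68 for each pair of parameters, the Gaussian approximation describes the 68% contours of the chain well.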

The diagonal panels also show the input values (vertical
lines). They are all within 1.5 posterior standard deviations from
the recovered values (see footnote 4). By fitting the observed data,
we recover the five parameters that describe the mass-richness
scaling with good accuracy and without bias.

In addition to input values, the diagonal panels show the
current low-redshift calibration of the richness-mass scaling
(dashed green line). Euclid masses significantly improve the
current low-redshift calibration of the richness-mass scaling: the
intercept $\alpha$, currently known to within 10 % (0.04 dex,
Sect. 2.1), will be known with per cent accuracy; the slope
$\beta$, currently known with a sizeable uncertainty
($0.47 \pm 0.12$, Sect. 2.1), will have its uncertainty reduced by a
factor 10. The intrinsic scatter, currently known with ~10 %
accuracy (Sect. 2.1), will be known with per cent accuracy. The
evolution of the intercept and of the intrinsic scatter will be
known with a 0.03 and 0.005 uncertainty, respectively. The computed
posterior is ~$10^3$ times narrower (in the
$\alpha$-$\beta$-$\sigma$ space) than the current calibration of the
richness-mass scaling. This capability makes the Euclid
mission unique and independent of the success of observations
other than the already acquired PanStarrs 1 multicolour data.
Instead, the calibration of the mass-proxy relation of the XXLS
cluster survey (Pierre et al. 2011) must rely on the success of an
expensive XMM calibration program (Pierre et al. 2011), which
is not yet implemented. Similarly, the SPT survey requires an
external calibration. Although the current cluster sample
consists only of 100 clusters (Reichardt et al. 2012), the currently
available calibration, not the sample size, is the main source of
uncertainty in cosmological estimates.

There is a strong covariance between the evolution and the
z = 0 value of the intercept ($\gamma$-$\alpha$ panel of Fig. 8).
It can be easily understood by noting that z = 0 is outside the
range of sampled redshifts. The covariance between the intrinsic
scatter and its evolution ($\zeta$-$\sigma_{\rm intrscat}$ panel of
Fig. 8) has a similar origin: the intrinsic scatter is defined at
an unobserved redshift, z = 0, instead of a redshift where it is
well observed.
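The origin of this covariance can be illustrated with an ordinary straight-line fit. The Python sketch below computes the correlation between the fitted intercept and slope when the intercept is defined at z = 0, and again when it is defined at the mean of the sampled values. The redshift range and the function name are hypothetical choices for this example; only the qualitative behaviour matters.

```python
import numpy as np

def intercept_slope_corr(x):
    """Correlation between least-squares intercept and slope for regressor x."""
    x = np.asarray(x, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    # Parameter covariance is proportional to (X^T X)^-1
    cov = np.linalg.inv(design.T @ design)
    return float(cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1]))

z = np.linspace(0.05, 0.6, 200)      # hypothetical sampled redshifts
x = np.log10(1.0 + z)
corr_at_z0 = intercept_slope_corr(x)                 # intercept defined at z = 0
corr_at_pivot = intercept_slope_corr(x - x.mean())   # intercept defined at the pivot
```

With the intercept defined at z = 0 the two parameters are strongly anti-correlated, while re-defining it at a pivot inside the sampled range removes the correlation.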

Fig. 9 compares the model fit (solid line) to the true input
relation, shown in stacks of 201 clusters per point. The fit to the
noisy data agrees well with the noise-less data (unobserved and
unused in the analysis), indicating that it captures the real
trend well.

In summary, by fitting observed data we recover with good
accuracy and without bias the five parameters describing the
mass-richness scaling. In particular, we assumed no evolution
(i.e. $\gamma = 0$ and $\zeta = 0$) and recovered it, even allowing
evolution on both scatter and intercept. We will be able to measure
the mass-richness scaling with an error (posterior parameter
standard deviation) of 0.007, 0.014, 0.005, 0.033, and 0.005 in
$\alpha$, $\beta$, intrinsic scatter, $\gamma$, and $\zeta$,
respectively. These are the predicted prior widths of cosmological
forecasts. Table 1 lists the covariance matrix.

4 There is only a 10 % probability that in a five-parameter fit all fitted
values are found within $1\sigma$ from the input values, and a 50 %
probability that they are all within $1.5\sigma$.

Fig. 9. Richness-mass scaling for the simulated PS1+Euclid-like
data. The solid line marks the regression line fitted on observed
data. The shaded region marks the 68% highest posterior credible
interval for the regression. The red dashed line indicates the
input relation. The data points are stacks of true data in bins of
201 clusters each; true data were never used in the fitting.

Strictly speaking, the conclusions of this subsection only hold
if our modelling of the richness-mass scaling is a reasonable
approximation of the scaling in the real Universe. They do not
hold if, for example, the richness-mass scaling suddenly
disappears in the real Universe at z = 0.3.

3.3. What happens if $\sigma_{\rm intrscat}$ doubles by z = 0.6?

To understand how sensitive the predicted prior is to a possibly
evolving mass-proxy scaling, we generated new data with
$\zeta = 0.18$, i.e. from a relation whose intrinsic scatter is
twice as large at z = 0.6 as at z = 0. To this aim, we replaced
equation 14 by

$$\mathrm{lgn200}_i \sim \mathcal{N}\left(0.47\,(\mathrm{lgM200}_i - 14.5) + 1.58,\ \sigma^2_{\rm intrscat}(z_i)\right) \tag{30}$$

$$\sigma^2_{\rm intrscat}(z_i) \leftarrow 0.25^2 - 1 + (1 + z_i)^{2\zeta} \tag{31}$$

$$\zeta \leftarrow 0.18 \tag{32}$$

and re-generated the new (simulated) data. We fitted
real+simulated data with no change whatsoever, and, as for a
non-evolving intrinsic scatter, we recovered the input parameters,
finding $\zeta = 0.16 \pm 0.01$ (vs input $\zeta = 0.18$). The other
four parameters were all recovered to better than their uncertainty.
More precisely, we found an error of 0.01, 0.02, 0.008, 0.046,
and 0.010 in $\alpha$, $\beta$, intrinsic scatter, $\gamma$, and
$\zeta$, respectively. These are larger (1.5 times, on average) than
before because with the larger scatter (at high redshift) more data
are needed to measure the mean relation with the same precision.
Nevertheless, the parameter volume they encompass is only a factor
nine larger than for a non-evolving intrinsic scatter, a negligible
factor (a mere 0.01 per cent) compared to what we discuss below.
Marginal and joint probability distributions (i.e. the covariance
matrix and Fig. 8 revised for these data) are qualitatively similar,
apart from the obvious $\approx 1.5$ factor.

Table 1. Predicted richness-mass prior parameters for a
PS1+Euclid-like survey: covariance matrix $\sigma_{ij}$

In summary, an increasing intrinsic scatter, if present, would
be easily recovered from the data, with only a mild degradation
of the overall performance (a factor 9 for a five-dimensional
volume), and no bias.
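The numbers of this subsection can be checked with a few lines of Python. The sketch evaluates Eq. 31 at z = 0 and z = 0.6 and multiplies the ratios of the quoted posterior standard deviations. Taking the product of marginal error ratios as the volume factor is an assumption of this example (it ignores changes in the parameter correlations), and the variable names are hypothetical.

```python
import math

def intrinsic_scatter(z, sigma0=0.25, zeta=0.18):
    """Intrinsic scatter (dex) at redshift z according to Eq. 31."""
    return math.sqrt(sigma0**2 - 1.0 + (1.0 + z) ** (2.0 * zeta))

# Posterior standard deviations quoted in the text, in the order
# alpha, beta, intrinsic scatter, gamma, zeta
errors_no_evolution = [0.007, 0.014, 0.005, 0.033, 0.005]
errors_evolving = [0.010, 0.020, 0.008, 0.046, 0.010]

scatter_ratio = intrinsic_scatter(0.6) / intrinsic_scatter(0.0)
ratios = [b / a for a, b in zip(errors_no_evolution, errors_evolving)]
volume_factor = math.prod(ratios)
```

With $\zeta = 0.18$ the scatter at z = 0.6 is very nearly twice its z = 0 value, and the product of the five error ratios is close to nine, in line with the factor quoted above.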

4. Discussion 

Our computation of the predicted prior of the mass-richness
scaling does not account for sub-dominant sources of error, such
as uncertainties related to projection and redshift-dependent
errors on the cluster photometric redshift, but it can easily take
them into account: it is just a matter of replacing the assumed
likelihood distributions (Eqs. 25 to 27) with updated distributions
accounting for the additional error terms one may wish to consider.
For example, we can change the normal intrinsic scatter
(questioned by Shaw et al. 2010) into a Student-t distribution by
typing less than ten characters (see Appendix for details). However,
more complex simulated data (e.g. based on an N-body simulation)
are needed to generate the data to be fitted, and more and
better real data are needed to characterize the real additional
dependencies (e.g. how to model the intrinsic scatter).

The starting point of literature forecasts is the end point of
this paper: they assume what our paper computes, and their prior
widths are our posterior parameter uncertainties. The predicted
prior is computable and thus does not need to be assumed.
Parameters show covariance, sometimes a strong one, while
none is assumed in literature forecasts (that we are aware of).

We note that the previous literature (starting perhaps with
Lima & Hu 2005) chose not to model the slope between mass
and proxy, i.e. implicitly assumed to know it perfectly. This
assumption seems optimistic because the slope is presently known
with 25 % accuracy (Sect. 2.1, summarized in Eq. 19). Sect. 2.3
shows that it will be known after PS1+Euclid with per cent
accuracy. If a perfect knowledge of the slope is assumed, then
uncertainties on the other scaling parameters (scatter, intercept,
and their evolution) will be underestimated. Furthermore, while
the quality of a mass proxy is lower at the ends of the calibration
range because of the slope uncertainty, the choice made in
previous works gives it a constant quality at all masses,
including those outside the range of the calibration sample.

As mentioned above, most previous works (e.g. Lima & Hu
2005, Cunha & Evrard 2010, Thomas & Contaldi 2011, Carbone
et al. 2012, etc.) adopted priors for the mass-observable scaling
largely by guessing how well the relation is (or will be)
known, instead of computing the prior width. Sometimes, the
prior width on some key parameters, like the scatter, was taken
to be zero. Some works (e.g. Cunha & Evrard 2010, Oguri
& Takada 2011) explored the sensitivity of cosmological
constraints to the adopted priors for the mass-observable scaling,
sometimes calling this sensitivity "systematics", quantifying the
(obvious) fact that poorly calibrated scaling relations give poor
cosmological constraints. For example, Cunha & Evrard (2010)
showed that cosmological constraints easily deteriorate by a
factor from $\sqrt{2}$ to 2 by changing the prior width from zero
to ~1 %.

Even more important, previous forecasts did not use the in- 
formation content in the weak-lensing masses to calibrate the 
mass-observable scaling. For comparison, we consider the pri- 
ors assumed in Carbone et al. (2012), who also considered the 
mass-richness scaling of a Euclid-like survey, but made no use 
of the Euclid weak-lensing masses to calibrate the mass-richness 
scaling. Before proceeding in this comparison, we emphasize a 
technical difference: the two modellings are identical after swap- 
ping observable and mass variables. For example, we modelled 
the scatter in proxy at a given mass as Gaussian, while Carbone 
et al. (2012) modelled the scatter in mass at a given proxy as 
Gaussian. Since the Carbone et al. (2012) model has no slope 
parameter, for the purpose of this comparison only, we removed 
the slope from the modelling (freezing it at the true value). 

Fig. 10 compares the prior adopted in Carbone et al. (we
emphasize once more the variable swapping) with our predicted
prior. A major point emerges: the parameter volume encompassed
by the Carbone et al. prior, which does not use weak lensing to
calibrate the mass-richness scaling, is $10^5$ times larger
(in the $\alpha$-$\sigma$-$\gamma$-$\zeta$ space) than the one we
derive using Euclid weak-lensing masses. Similarly, the Euclid
imaging consortium science book (EICSB, Refregier et al. 2010) does
not use the Euclid weak-lensing masses to calibrate the
mass-richness scaling and assumes a 25 %, or 0.25, prior uncertainty
on each parameter of an observable-mass relation modelled with a
third, different, parametrization. At face value, given that our
precisions are typically one order of magnitude better per
parameter, using weak-lensing masses may allow us to improve the
knowledge of the observable-mass scaling by a similarly large
($\approx 10^5$) amount. If the mass-proxy scaling can be computed
$10^5$ times better, stronger cosmological constraints can probably
be inferred and this may alter the balance between the
cosmological constraints achievable using cluster counts, BAOs, SNe,
and weak-lensing tomography. Indeed, Carbone et al. (2012)
estimated that if the regression parameters were perfectly known,
then cosmological constraints tighten (technically: the volume of
cosmological parameter space enclosed by the posterior probability
distribution decreases) by a factor $\approx 100$ compared to the
case where one marginalises over their (extremely wide) prior
(see their Table 2). The gain on the constraints on dark energy
parameters alone is instead only mild: a factor 2. The precise
computation of the gains in our specific case is, however,
outside the aim of this work.
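A volume ratio in an n-dimensional parameter space can be translated into a typical per-parameter factor by taking its n-th root. The short Python sketch below does this; it assumes that the gain is shared equally among the parameters and that correlations are ignored, and the function name is hypothetical.

```python
def per_parameter_gain(volume_ratio, n_params):
    """Geometric-mean improvement per parameter for a given volume ratio."""
    return volume_ratio ** (1.0 / n_params)

gain_four_params = per_parameter_gain(1e5, 4)   # alpha, sigma, gamma, zeta space
loss_five_params = per_parameter_gain(9.0, 5)   # evolving-scatter case of Sect. 3.3
```

Under these assumptions a $10^5$ volume ratio in four parameters corresponds to a factor of about 18 per parameter, consistent with the order-of-magnitude per-parameter gain mentioned above, and a factor 9 in five parameters corresponds to about 1.5 per parameter.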



5. Summary 

The aim of this work was threefold: first, using 53 clusters with 
individual measurements of mass, we derived the richness-mass 
scaling in the local Universe. We found a 0.46 ± 0.12 slope and a
0.25 ± 0.03 dex scatter in (log) richness at a given mass measur- 
ing the richness following the Andreon & Hurn (2010) prescrip- 
tions. The fit accounts for the fact that the cluster sample is X-ray 
selected and massive clusters are over-represented, although we 
found that the sample selection is a minor source of concern for 
this sample. Because the scatter around the regression is derived 
from measurements of the individual masses and richnesses, our 
measurement of the scatter is preferable to others derived with- 
out knowledge of individual cluster masses, such as those of the 
maxBCG team (e.g. Rykoff et al. 2011).

Fig. 10. Example of mismatch between predicted prior of the richness-mass scaling, as derived by us (in blue) and as adopted in
other works (in red, from Carbone et al. 2012). This comparison should be seen as indicative only, because of differences between 
the two mass-proxy modellings. 



Secondly, using 250 clusters at 0.06 < z < 0.9 with
spectroscopic redshift, mostly at z < 0.3, we found that the cluster
redshift can be derived with an accuracy better than
$\Delta z = 0.02$ from the colour of the red sequence.

Thirdly, we computed the predicted prior between mass and
richness, i.e. one of the input ingredients needed to judge how
strongly future surveys using clusters may constrain cosmological
parameters, and to what extent clusters can compete with other
cosmological probes.

To this aim, we generated a simulated universe obeying 
the derived richness-mass scaling, observed it with a mock 
PanStarrs 1 +Euclid-like survey, allowing for intrinsic scatter be- 
tween regressed quantities, allowing for mass and richness er- 
rors, and also allowing for cluster photometric redshift errors. 
The generated sample does not contain any cluster with an 
S /N > 5 weak-lensing detection at z > 0.62 (Fig 7). 

We fitted the observations with an evolving richness-mass
scaling with five parameters to be determined. We allowed an
evolution in the intercept (sometimes called bias) and intrinsic
scatter. We allowed an uncertainty on the intrinsic scatter and on
the intercept, as in previous works, but in contrast to all previous
approaches, we did not sidestep the modelling of the slope.

Our fitting model recovers the input parameters, but only
if the cluster mass function and the redshift-dependent S/N > 5
weak-lensing survey selection function are accounted for.
Neglecting them causes fit values to deviate by $> 5\sigma$ from the
input values, as a result of the neglected Malmquist-like bias.
This result emphasizes the limitations of often adopted
simplifying assumptions, such as mass-complete redshift-independent
samples. Including the optical selection function is unnecessary
because all clusters with a weak-lensing signature are so massive
and rich that detecting their red galaxy overdensity is trivial.
Already available imaging data from PanStarrs 1 are of sufficient
quality to detect these galaxies, whereas mass estimates await
the Euclid mission.

We derived the uncertainty and the covariance matrix of the
(evolving) richness-mass scaling, which are the input ingredients
of every cosmological forecast using cluster counts. These
five parameters will be known with per cent precision thanks to
masses estimated from Euclid data. There are non-negligible
covariance terms between the five regression parameters. These
numbers, listed in Table 1, are the third main result of this work.
Their determination does not require the success, or acquisition,
of other data presently not available, which is required for other
cluster surveys, such as the XXLS and SPT surveys.

We found that the richness-mass scaling parameters can be
determined $10^5$ times better (the volume enclosed by the posterior
is $10^5$ times smaller) than estimated before without the use of
weak-lensing mass estimates, although we emphasize that this number
was derived using scaling relations with different
parametrizations. A better knowledge of the scaling parameters
likely has a strong impact on the relative importance of the
different probes used to constrain cosmological parameters.

Finally, we checked that if the intrinsic scatter between mass 
and richness increases by a factor two by z = 0.6, we are nev- 
ertheless able to recover the mass-richness scaling without bias, 
with only a factor 9 (about 1.5 per parameter) degradation in the 
quality with which we are able to recover the scaling parameters.

The fitting code, including the treatment of the mass function
and the weak-lensing selection function, is provided in the
appendix. It can also be re-used, for example, to derive the
predicted prior of other observable-mass scalings, such as the
$L_X$-mass relation.

Appendix A: Model for computing the predicted
prior of the mass-richness scaling including the
weak-lensing selection function

Eq. 12 and Eqs. 17 to 29 are almost literally translated into JAGS
(Plummer 2008; see footnote 5): Poisson, normal, and uniform
distributions become dpois, dnorm, dunif, respectively. JAGS,
following BUGS (Spiegelhalter et al. 1995), uses precisions,
prec = $1/\sigma^2$, in place of variances $\sigma^2$. The only
complication comes from sampling from a distribution unavailable
in JAGS, a truncated Schechter function. This is achieved by
exploiting the property that a Poisson($\phi$) observation of zero
has likelihood $\exp(-\phi)$. Consequently, if our observed data are
a set of 0's, and $\phi[i]$ is set to $-\log \mathcal{L}[i]$, we
obtain the correct likelihood contribution. The quantity $\phi[i]$
should always be greater than 0, because it is a Poisson mean, and
we may accordingly need to add a suitable constant, C, to ensure
that it is positive. The quantity lg10tot.norm normalises the
integral of the obslgM200 likelihood to one. The model (set of
equations) reads in JAGS:



5 http://calvin.iarc.fr/~martyn/software/jags/

data
{
  # precision of the observed masses (JAGS uses precisions, not variances)
  preclgM200 <- 1./(errlgM200^2)
  # normalisation: lg10 of the integral of the truncated Schechter function
  lg10tot.norm <- 0.386165-3.92996*obsz-0.247050*obsz^2-2.55814*obsz^3-5.26633*obsz^4
  # dummy variable for zero-trick, to sample from a distribution not available in JAGS
  for (i in 1:length(obslgM200)) {
    dummy[i] <- 0
  }
  # constant added to keep the Poisson mean of the zero-trick positive
  C <- 2
}

model
{
  # priors from the low-redshift calibration (Sect. 2.1)
  intrscat ~ dnorm(0.25,1/0.03/0.03)
  prec.intrscat <- 1/intrscat^2
  alpha ~ dnorm(0.08,1/0.04/0.04)
  beta ~ dnorm(0.47,1/0.12/0.12)
  # weak priors on the evolution terms (Eqs. 23 and 24)
  gamma ~ dt(0,1,1)
  csi ~ dt(0,1,1)

  for (i in 1:length(obsn200)) {
    # modelling lgM200
    # dummy prior, requested by JAGS, to be modified later; the lower limit is
    # the truncation of Eq. 12 (coefficients recovered from a degraded listing,
    # to be checked against Eq. 12)
    lgM200[i] ~ dunif(13.9891+1.04936*obsz[i]+0.488881*obsz[i]^2,16)
    # modelling a truncated schechter
    lnnumerator[i] <- -(10^(0.4*(lgM200[i]-12.6+(obsz[i]-0.3))))
    # its integral, from the starting point of the integration (S/N=5);
    # loglike is minus the log-likelihood plus C, i.e. the Poisson mean of the zero-trick
    loglike[i] <- -lnnumerator[i]+lg10tot.norm[i]*log(10)+C
    # sampling from an unavailable distribution
    dummy[i] ~ dpois(loglike[i])
    obslgM200[i] ~ dnorm(lgM200[i],preclgM200[i])
    # modelling n200, z and relations
    obsn200[i] ~ dpois(pow(10,lgn200[i]))
    obsz[i] ~ dnorm(z[i],pow(0.02,-2))
    # uniform redshift prior (Eq. 29)
    z[i] ~ dunif(0,1)
    # modelling mass-n200 relation allowing for evolution
    lgn200m[i] <- alpha+1.5+beta*(lgM200[i]-14.5)+gamma*(log(1+z[i]))
    lgn200[i] ~ dnorm(lgn200m[i],prec.intrscat.z[i])
    prec.intrscat.z[i] <- 1/(1/prec.intrscat-1+(1+z[i])^(2*csi))
  }
}

To adopt a Student's t-distribution with ten degrees of freedom
(dt) to model the intrinsic scatter (Sect. 4), it suffices to
replace the line starting with lgn200[i] with

lgn200[i] ~ dt(lgn200m[i],prec.intrscat.z[i],10)
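The zero-trick used in the model above can be verified numerically outside JAGS. The following Python sketch checks that a Poisson observation of zero with mean $\phi = -\log \mathcal{L} + C$ contributes a likelihood equal to $\mathcal{L}$ times the constant $\exp(-C)$, which does not affect the posterior. The function names are hypothetical and the value C = 2 simply mirrors the data block.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of observing k events for mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def zeros_trick_likelihood(log_like, c=2.0):
    """Likelihood contribution of a zero 'observation' with mean -log_like + c."""
    phi = -log_like + c
    if phi <= 0:
        raise ValueError("increase C so that the Poisson mean is positive")
    return poisson_pmf(0, phi)   # equals exp(log_like) * exp(-c)
```

For example, `zeros_trick_likelihood(-1.3)` agrees with `math.exp(-1.3) * math.exp(-2.0)`.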



